The apply function is one of the most important tools for row or column manipulation: it applies a function to either the rows or the columns of a matrix or data frame. It is a powerful and concise way to perform operations across rows or columns.
The primary purpose of using a row-wise apply function is to perform element-wise operations or computations on individual rows of the data and generate new values or summaries based on the row-level data.
Here are a few common use cases and benefits of using row-wise apply functions:
Element-wise operations: Sometimes, you need to perform calculations or transformations on elements within a row independently of other rows. Using a row-wise apply function allows you to conveniently apply a function to each row, processing the elements in isolation.
Feature engineering: When working with datasets, you might need to create new features based on existing ones. A row-wise apply function can be used to generate new columns in a DataFrame or matrix by applying a function to the values of each row.
Row-wise aggregation: In data analysis, you may want to summarize or aggregate data at the row level, like calculating the mean, median, sum, or any custom function of multiple values within a row. Row-wise apply functions enable you to perform these operations efficiently.
Conditional computations: When dealing with complex data structures, you may need to apply different computations to different rows based on specific conditions. Row-wise apply functions can handle such cases by allowing you to implement custom logic for each row.
Parallel processing: Depending on the implementation, some row-wise apply functions can take advantage of parallel processing, which can significantly speed up computations on large datasets.
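As a rough illustration of the feature-engineering and conditional-computation cases above, the following pandas sketch derives a new column from the values in each row. The orders frame, its price and quantity columns, the order_total helper, and the 10% discount rule are all hypothetical and exist only for this example.

```python
import pandas as pd

# Hypothetical example data; the column names are illustrative only.
orders = pd.DataFrame({
    "price": [10.0, 20.0, 30.0],
    "quantity": [1, 5, 10],
})

def order_total(row):
    """Return price * quantity, with a 10% discount for 5 or more items."""
    total = row["price"] * row["quantity"]
    if row["quantity"] >= 5:
        total *= 0.9  # conditional logic applied per row
    return total

# axis=1 passes each row to order_total as a Series.
orders["total"] = orders.apply(order_total, axis=1)
print(orders)
```

The function sees one row at a time, so the condition can depend on any combination of that row's values.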
When choosing to use a row-wise apply function, consider the size of your dataset, the complexity of the computation, and the available resources. As with any programming task, it’s essential to balance readability, maintainability, and performance.
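To make that trade-off concrete, here is a small sketch, assuming pandas, that computes the same per-row value in two ways: once with a row-wise apply and once with built-in vectorized methods. The per-row range (maximum minus minimum) is only an illustrative choice, and the variable names are assumptions. On large frames the vectorized form is usually the faster of the two, while apply stays the more flexible option when no built-in method fits.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})

# Row-wise apply: the lambda is called once per row.
applied = df.apply(lambda row: row.max() - row.min(), axis=1)

# Vectorized equivalent: two calls that each cover the whole frame.
vectorized = df.max(axis=1) - df.min(axis=1)

print(applied)
print(vectorized)
```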
R-Code
Using apply for Row-wise sum
df <- data.frame(
  A = c(1, 2, 3),
  B = c(4, 5, 6),
  C = c(7, 8, 9)
)
df
A B C
1 1 4 7
2 2 5 8
3 3 6 9
# MARGIN = 1 applies sum to each row
df$rowsum <- apply(df, 1, sum)
df
A B C rowsum
1 1 4 7 12
2 2 5 8 15
3 3 6 9 18
Here apply uses the built-in sum function, but a custom function works just as well. As an example, we define a function mul2each that squares each number in the row, adds 2 to each square, and sums the results.
mul2each <- function(x){
  # square each element, add 2, then sum over the row
  res = sum((x*x) + 2)
  return(res)
}
df2 <- data.frame(
  A = c(1, 2, 3),
  B = c(4, 5, 6),
  C = c(7, 8, 9)
)
df2$result <- apply(df2, 1, mul2each)
df2
A B C result
1 1 4 7 72
2 2 5 8 99
3 3 6 9 132
In the same way, you can write whatever function your task requires and pass it to apply as the third argument.
Using apply for Column-wise sum
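df3 <- data.frame(
  A = c(1, 2, 3),
  B = c(4, 5, 6),
  C = c(7, 8, 9)
)
# Note: the call below uses df from the row-wise example, which still has the
# rowsum column, so that column is summed too. Pass df3 to sum only A, B, and C.
# MARGIN = 2 applies sum to each column
colsum <- apply(df, 2, sum)
colsum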
A B C rowsum
6 15 24 45
Similarly, you can write a custom function and pass it to apply with MARGIN = 2 for column-wise manipulation.
Python
Using apply for Row-wise sum
import pandas as pd
df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9]
})
df
A B C
0 1 4 7
1 2 5 8
2 3 6 9
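For comparison with the R example, a plain row-wise sum stored as a new column might look like the sketch below. The with_sum name and the rowsum column label are assumptions chosen to mirror the R code; a copy is used so that the original frame keeps only columns A, B, and C.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})

# Work on a copy so df itself is left unchanged.
with_sum = df.copy()

# axis=1 passes each row to the lambda as a Series.
with_sum["rowsum"] = df.apply(lambda row: row.sum(), axis=1)
print(with_sum)
```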
Creating a custom function: the sum of (number ** 2 + 2)
def sum_of_squares(row):
    # square each value, add 2, then sum
    return sum((row**2)+2)

# axis=1 applies the function to each row
row_sum = df.apply(sum_of_squares, axis=1)
print(row_sum)
0 72
1 99
2 132
dtype: int64
Using apply for Column-wise sum
# axis=0 applies the function to each column
column_sum = df.apply(sum_of_squares, axis=0)
print(column_sum)
A 20
B 83
C 200
dtype: int64
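The difference between the two axis values is easier to see by inspecting what apply hands to the function. In the sketch below, describe_labels is a hypothetical helper that only reports the index labels of the Series it receives: with axis=1 it gets a row (labelled by column names), and with axis=0 it gets a column (labelled by row index).

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})

def describe_labels(values):
    """Join the index labels of the Series that apply passes in."""
    return "-".join(str(label) for label in values.index)

# axis=1: one call per row; each Series is indexed by the column names.
per_row = df.apply(describe_labels, axis=1)

# axis=0: one call per column; each Series is indexed by the row labels.
per_column = df.apply(describe_labels, axis=0)

print(per_row)
print(per_column)
```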